Rhythm and Transforms has three parts.
There are chapters about music
theory, practice, and composition;
chapters about the psychology
and makeup of listeners; and chapters
about the technologies involved in
finding rhythms.
Music: Chapter 2 discusses some of the many ways people think about and notate rhythmic patterns.
Chapter 3 surveys the musics of the world and shows many
different ways of conceptualizing the use of
rhythmic sound.
Perception: The primary difficulty with the automated detection of rhythms is that the beat is not directly present in the musical signal; it is in the mind of the listener. Hence it is necessary to understand and model the basic perceptual apparatus of the listener.
Chapter 4 describes some of the basic perceptual laws
that underlie rhythmic sound.
There are three approaches to the beat finding
problem: transforms, adaptive oscillators, and statistical methods.
Each makes a different set of assumptions about the
nature of the problem, uses a different kind of mathematics,
and has different strengths, weaknesses, and areas of applicability.
Despite the diversity of the approaches, there are some
common themes: the identification of the period and the phase
of the rhythmic phenomena and the use of certain kinds of
optimization strategies.
Transforms: The transforms of
Chapter 5 model a signal as a collection of waveforms of a particular form. The Fourier transform presumes that the signal can be modelled as a sum of sinusoidal oscillations. Wavelet transforms operate under the assumption that the signal can be decomposed into a collection of shifted and stretched copies of a single mother wavelet. The
periodicity transforms presume that the signal contains a strong periodic component and decompose it under this assumption. When these assumptions hold, there is a good chance that the methods will work well when applied to the search for repetitive phenomena. When the assumptions fail, so do the methods.
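As a rough illustration of the Fourier assumption (a sketch, not the method of Chapter 5), the period of a repetitive signal can be estimated from the strongest non-constant bin of a discrete Fourier transform. The function names `dft_magnitudes` and `dominant_period` and the toy signal are assumptions made for this example only.

```python
import cmath

def dft_magnitudes(x):
    """Return |X[k]| for k = 0..N//2 using the direct O(N^2) DFT definition."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]

def dominant_period(x):
    """Estimate the period (in samples) from the strongest non-DC DFT bin."""
    mags = dft_magnitudes(x)
    # Skip k = 0, which only measures the average level of the signal.
    k_peak = max(range(1, len(mags)), key=lambda k: mags[k])
    return len(x) / k_peak

# Toy signal (an assumption for illustration): a decaying pulse every 8 samples.
signal = [0.5 ** (t % 8) for t in range(64)]
period = dominant_period(signal)
```

The estimate is only as good as the assumption behind it: this works because the toy signal repeats exactly and its period divides the signal length. A signal whose period drifts, or does not fit the analysis window, spreads its energy over many bins and the peak becomes much less reliable.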
Adaptive Oscillators: The dynamical system approach of
Chapter 6
views a musical signal (or a feature vector derived from that signal)
as a kind of clock.
The system contains one or more oscillators, which
are also a kind of clock. The trick
is to find a way of coupling the music-clock
to the oscillator-clock so that they synchronize.
Once synchronization is achieved, the beats can be read directly from the
output of the synchronized oscillator.
Many such coupled-oscillator systems are in
common use: phase-locked loops are dynamic oscillators that
synchronize the carrier signal at a receiver to the carrier
signal at a transmitter; the "seek" button on a radio
engages an adaptive system that scans through a
large number of possible stations and locks onto one
that is powerful enough for clear reception;
timing recovery is a standard trick used in cell phones
to align the received bits into sensible packets;
and clever system design within the power grid automatically synchronizes
the outputs of electrical generators (rotating machines that
are again modelled as oscillators) even though they may be
thousands of miles apart. Thus synchronization technologies
are well developed in certain fields, and there is hope that
insights from these may be useful in the rhythm finding problem.
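The coupling idea can be sketched with a very small discrete-time tracker in the style of a second-order phase-locked loop. This is an illustrative stand-in, not one of the oscillators of Chapter 6: the function `track_pulses`, its gains, and the test pulse train are all assumptions. The oscillator predicts when the next pulse should arrive, and each timing error nudges both its phase and its period.

```python
def track_pulses(onsets, period, phase_gain=0.5, period_gain=0.2):
    """Synchronize a simple oscillator-clock to a list of onset times.

    onsets: times (in seconds) of the pulses of the music-clock.
    period: initial guess of the oscillator period (in seconds).
    Returns the oscillator's beat times and its final period.
    """
    predicted = onsets[0]          # start the oscillator on the first pulse
    beats = []
    for onset in onsets:
        error = onset - predicted  # positive if the oscillator ran early
        period += period_gain * error     # adapt the period (frequency)
        predicted += phase_gain * error   # adapt the phase
        beats.append(predicted)
        predicted += period        # free-run until the next expected pulse
    return beats, period

# Toy music-clock (an assumption): pulses every 0.5 s, oscillator starts at 0.6 s.
onsets = [0.5 * n for n in range(40)]
beats, period = track_pulses(onsets, period=0.6)
```

With these gains the timing error decays geometrically and the oscillator period settles near 0.5 s. Larger gains synchronize faster but respond more strongly to individual timing deviations, which is the usual trade-off in such loops.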
Statistical Methods: The models of
Chapter 7
relate various characteristics of a musical signal to the
probability of occurrence of features of interest.
For example, a repetitive pulse of energy at equidistant times
is a characteristic of a signal that is likely to represent
the presence of a beat; a collection of harmonically related
overtones is a characteristic that likely represents the presence
of a musical instrument playing a particular note.
Once a probabilistic (or generative) model
is chosen, techniques such as Kalman filters and
Bayesian particle filtering can be used to estimate
the parameters within the models, for instance, the
times between successive beats.
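As a minimal sketch of this kind of parameter estimation, a scalar Kalman filter can track the time between successive beats from noisy interval measurements. The random-walk model of the interval, the noise variances, the function name `kalman_interval`, and the data are assumptions chosen for illustration; they are not the models of Chapter 7.

```python
def kalman_interval(observed, initial, initial_var=1.0,
                    process_var=1e-4, meas_var=1e-2):
    """Track the inter-beat interval with a scalar Kalman filter.

    Assumed model: the true interval follows a slow random walk
    (variance process_var per step) and each observed interval is the
    true interval plus independent noise (variance meas_var).
    """
    estimate, variance = initial, initial_var
    history = []
    for z in observed:
        variance += process_var                   # predict: uncertainty grows
        gain = variance / (variance + meas_var)   # Kalman gain
        estimate += gain * (z - estimate)         # correct with the measurement
        variance *= (1.0 - gain)                  # uncertainty shrinks
        history.append(estimate)
    return history

# Hypothetical measured intervals (seconds) scattered around 0.5 s.
intervals = [0.52, 0.49, 0.51, 0.48, 0.50, 0.53, 0.47, 0.50, 0.51, 0.49]
history = kalman_interval(intervals, initial=0.6)
```

Starting from a poor initial guess of 0.6 s, the estimate moves quickly toward the data and then settles close to the 0.5 s average as its variance shrinks. Raising `process_var` lets the filter follow tempo changes more readily at the cost of a noisier estimate.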
Beat Tracking:
Chapter 8 applies the three technologies
for locating rhythmic patterns (transforms, adaptive oscillators,
and statistical methods)
to three levels of processing: to symbolic patterns where
the underlying pulse is fixed (e.g., a musical score),
to symbolic patterns where the underlying pulse may vary
(e.g., MIDI data), and to time series data where the pulse
may be both unknown and time varying (e.g., feature vectors derived from audio).
The result is a tool that tracks the beat of
a musical performance.
Beat-Based Signal Processing: The beat timepoints are used in
Chapter 9 as a way to
intelligently segment the musical signal.
Signal processing techniques can be applied on a
beat-by-beat basis: beat-synchronized filters, delay lines,
and special effects; beat-based spectral mappings with
harmonic and/or inharmonic destinations; and beat-synchronized
transforms. This chapter introduces
several new kinds of beat-oriented sound manipulations.
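Beat-by-beat processing can be sketched as follows, assuming the beat timepoints are already available as sample indices. The helper names (`split_at_beats`, `process_per_beat`, `fade_out`) and the linear fade are hypothetical choices for illustration, not effects taken from Chapter 9.

```python
def split_at_beats(samples, beat_indices):
    """Cut samples into one segment per beat interval [b_i, b_{i+1})."""
    return [samples[a:b] for a, b in zip(beat_indices, beat_indices[1:])]

def process_per_beat(samples, beat_indices, effect):
    """Apply effect to each beat-long segment and reassemble the signal.

    Samples before the first beat and after the last beat pass through
    unchanged.
    """
    out = list(samples[:beat_indices[0]])
    for segment in split_at_beats(samples, beat_indices):
        out.extend(effect(segment))
    out.extend(samples[beat_indices[-1]:])
    return out

def fade_out(segment):
    """A simple beat-synchronized effect: a linear fade across the beat."""
    n = len(segment)
    return [s * (1 - i / n) for i, s in enumerate(segment)]

# Toy signal (an assumption): constant amplitude with a beat every 4 samples.
samples = [1.0] * 12
beat_indices = [0, 4, 8, 12]
shaped = process_per_beat(samples, beat_indices, fade_out)
```

Because the effect restarts at every beat boundary, the result is locked to the rhythm of the performance rather than to a fixed clock. Any function that maps one segment to another can be passed in place of `fade_out`.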
Beat-Based Musical Recomposition:
Chapter 10 shows how the beats of a single piece may be
rearranged and reorganized to create new structures and rhythmic patterns,
including beat-based "variations on a theme."
Beats from different pieces can be combined in a cross-performance
synthesis.
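Once a piece has been cut into beat-long segments, rearrangement reduces to reordering a list. The sketch below assumes each beat is a list of samples; `recompose` and `interleave` are hypothetical helpers, and simple alternation is only one of many conceivable ways to combine beats from two pieces.

```python
def recompose(beats, order):
    """Build a new signal by concatenating beats in the given order.

    beats: list of beat segments (each a list of samples).
    order: beat indices to play, possibly with repeats or omissions.
    """
    out = []
    for index in order:
        out.extend(beats[index])
    return out

def interleave(beats_a, beats_b):
    """Alternate beats from two pieces, stopping at the shorter one."""
    out = []
    for a, b in zip(beats_a, beats_b):
        out.extend(a)
        out.extend(b)
    return out

# Toy "pieces" (assumptions): each beat holds two labelled samples.
piece_a = [["a0", "a0"], ["a1", "a1"], ["a2", "a2"], ["a3", "a3"]]
piece_b = [["b0", "b0"], ["b1", "b1"]]

reversed_piece = recompose(piece_a, [3, 2, 1, 0])    # play the beats backwards
stutter = recompose(piece_a, [0, 0, 1, 1])           # repeat individual beats
mixed = interleave(piece_a, piece_b)                 # combine two pieces
```

In practice the beats of a real performance differ slightly in length, so segments would need to be stretched or crossfaded at the joins; that step is omitted here.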
Beat-Based Rhythmic Analysis: Traditional musical analysis often focuses on note-based musical scores. Because scores exist for only a small subset of the world's music, it is helpful to be able to analyze performances directly, to probe both the literal and the symbolic levels.
Chapter 11 creates skeletal
rhythm scores that capture some of the salient
aspects of the rhythm. By conducting analyses in a beat-synchronous
manner, it is possible to track changes in a number of
psychoacoustically significant musical variables.